Speeding up Xml Querying

نویسندگان

  • Jinghua Groppe
  • Volker Linnemann
چکیده

This dissertation develops approaches to testing the satisfiability and the containment of XPath queries in the presence of XML Schema definitions in order to speed up XML querying. XML provides a simple yet powerful mechanism for information storage, processing and delivery, and is a widely used standard data format. XPath is a basic language for querying XML data, and is embedded into many W3C standards, e.g. XQuery, XLink, XML Schema, XForm and Schematron, for addressing XML data. Therefore, XPath optimization plays a key role in speeding up XML query processing. The satisfiablity test and containment test of XPath are two important issues in XPath optimization. An unsatisifable XPath query selects every time an empty result. Therefore, the application of the satisfiability test can avoid the unnessesary submission and the unnecessary evaluation of unsatisfiable queries, and thus can save querying costs. In programming languages, which embed XPath, like XOBE [Kempa and Linnemann 2003a], the satisfiability test can enable an efficient development of more robust applications by avoiding extensive tests and runtime failures caused by unsatisfiable queries. The satisfiability test can also speed up the execution of codes by the pre-computation of an empty result at compile time. Furthermore, the XPath satisfiability test plays an important role in other applications, e.g. XML access control [Fan et al. 2004], type-checking of transformations [Martens and Neven 2004] and XPath-based index update [Hammerschmidt et al. 2005]. The containment of XPath is another key factor for XPath evaluation. XPath containment can be used to minimize XPath expressions to speed up query evaluation. When using views to answer queries, the containment test is the underlying technique to decide if a new query can be answered using the results of previous queries. Using views to answer queries can significantly improve the performance of XPath processing, and reduce the communication and query costs by significantly decreasing shipped data, since part of query evaluation has bee done when computing the cache, and since the partial or even the whole answer to the new query is already available at client side. XPath containment can also find its applications in inferring the keys of XML Schema and in testing the satisfiability of XPath queries. Since the high complexity of XPath queries, it is not trivial to develop efficient approaches to checking XPath satisifiability and to checking XPath containment when schemas, especially recursive schemas, are in presence. [Choi 2002] shows that recursive schemas are often used in the real world. The existing solutions to XPath satisfiability consider only some subsets of XPath axes and non recursive schemas. In this thesis, we propose an approach to XPath satisfiability in the presence of XML Schema definitions, and support all XPath axes, and recursive as well as non-recursive schemas. Since XPath containment has a high complexity under constraints, there is lack of work on practical solutions to this issue. In this work, we develop an approach to checking XPath containment under constraints of XML Schema definitions. Furthermore, we develop a data model for XML Schema and an XPathXSchema evaluator based on the data model. We as well suggest an approach to rewriting and optimization of XPath expressions according to schemas. Our XPath-XSchema evluator evaluates XPath queries on an XML Schema definition, in order to check satisfiability and containment of XPath expressions with respect to the schema. We present a complexity analysis of our XPathXSchema evalutor, which proves that our approach is efficient at typical cases. We present an experimental analysis of our satisfiability tester, which proves the optimization potential of avoiding the evaluation of unsatisfiable queries. We prove the correctness of our approach to XPath containment, and analyze the complexity of our approach. We develop a prototype of our containment tester and the experimental results show the efficiency of our approach.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Integrated Access Control for Securely Querying and Updating XML Data

Many existing access controls use node filtering or querying rewriting techniques. These techniques require rather time-consuming processes such as parsing, labeling, pruning and/or rewriting queries into safe ones each time a user requests a query or takes an action. In this paper, we propose a fine-grained access control model, named SecureX, which supports read and write privileges. With our...

متن کامل

Speeding up XML querying: satisfiability test & containment test of XPath queries in the presence of XML schema definitions

This dissertation develops approaches to testing the satisfiability and the containment of XPath queries in the presence of XML Schema definitions in order to speed up XML querying. XML provides a simple yet powerful mechanism for information storage, processing and delivery, and is a widely used standard data format. XPath is a basic language for querying XML data, and is embedded into many W3...

متن کامل

Storing and Querying XML Documents Without Using Schema Information

As the popularity of eXtensible Markup Language (XML) continues to increase at an astonishing pace, data management systems for storing and querying large repositories of XML data are urgently needed. In this paper, we investigate using a Relational Database Management System (RDBMS) for storing and querying XML data. We present a mapping scheme, called PAID, for mapping XML documents to relati...

متن کامل

XPath Extension for Querying Concurrent XML Markup∗

XPath is a language for addressing parts of an XML document. It is used in many XML query languages and it can be used by itself for querying XML documents. While XPath is, in general, efficient for querying individual XML documents, it lacks the features for querying over collections of documents or joining parts of the same document. As the amount of complex document-centric XML data is conti...

متن کامل

Concept based querying of semistructured data

In the last years, semistructured data has played an increasing role within the database community. Many query languages have been developed for querying semistructured data and in particular XML data sources. XML data often is described by means of DTDs and more recently through XML schemas. This paper is about querying semistructured data by making use of the schema and the types described th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008